Author Response for ' Shaping Belief States with Generative Environment Models for RL '
We are grateful for the constructive and actionable feedback provided by the reviewers, and we believe we have addressed their key concerns below. [...]'s concerns with our main hypothesis as it has not [...] We are working to improve our explanations in Section 2.2 based on all feedback. We emphasize that careful empirical experimentation in ML can also bring valuable insights to the community. Studying these factors requires an intersectional empirical study such as this paper. Probabilistic models benefit more from overshoot than deterministic models.
Shaping Belief States with Generative Environment Models for RL
When agents interact with a complex environment, they must form and maintain beliefs about the relevant aspects of that environment. We propose a way to efficiently train expressive generative models in complex environments. We show that a predictive algorithm with an expressive generative model can form stable belief-states in visually rich and dynamic 3D environments. More precisely, we show that the learned representation captures the layout of the environment as well as the position and orientation of the agent. Our experiments show that the model substantially improves data-efficiency on a number of reinforcement learning (RL) tasks compared with strong model-free baseline agents. We find that predicting multiple steps into the future (overshooting), in combination with an expressive generative model, is critical for stable representations to emerge. In practice, using expressive generative models in RL is computationally expensive and we propose a scheme to reduce this computational burden, allowing us to build agents that are competitive with model-free baselines.
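To make the idea of overshooting concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a toy one-dimensional environment, a hand-written transition function, and a squared-error loss standing in for the log-likelihood of an expressive generative model. All function names (`overshoot_pairs`, `rollout`, `overshoot_loss`) are hypothetical and chosen for illustration only.

```python
def overshoot_pairs(seq_len, max_overshoot):
    """List (t, k) pairs: predict the observation at step t + k from the belief at step t."""
    pairs = []
    for t in range(seq_len):
        for k in range(1, max_overshoot + 1):
            if t + k < seq_len:
                pairs.append((t, k))
    return pairs


def rollout(belief, actions, transition):
    """Advance a simulation state from `belief` using actions only (no new observations)."""
    state = belief
    for action in actions:
        state = transition(state, action)
    return state


def overshoot_loss(observations, actions, beliefs, transition, decode, max_overshoot):
    """Mean squared error over all overshoot pairs.

    Assumption: squared error is a placeholder for a generative model's
    negative log-likelihood; it is not the loss used in the paper.
    """
    pairs = overshoot_pairs(len(observations), max_overshoot)
    total = 0.0
    for t, k in pairs:
        # actions[t] is the action taken at step t, leading to step t + 1.
        state = rollout(beliefs[t], actions[t:t + k], transition)
        total += (decode(state) - observations[t + k]) ** 2
    return total / len(pairs)


# Toy example (assumed): the observation is a position on a line,
# and each action shifts that position.
observations = [0, 1, 3, 2]
actions = [1, 2, -1]
beliefs = list(observations)  # a perfect belief state for this toy case

exact = overshoot_loss(observations, actions, beliefs,
                       transition=lambda s, a: s + a,
                       decode=lambda s: s,
                       max_overshoot=2)
ignores_actions = overshoot_loss(observations, actions, beliefs,
                                 transition=lambda s, a: s,
                                 decode=lambda s: s,
                                 max_overshoot=2)
print(exact, ignores_actions)
```

In this toy case the transition that uses the actions predicts every future observation exactly, while the one that ignores them is penalised at every horizon. That is the pressure overshooting is meant to put on the belief state: it must carry enough information to predict several steps ahead without seeing new observations.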
Reviews: Shaping Belief States with Generative Environment Models for RL
Post-rebuttal update: I appreciate the additional explanation of the need for overshooting in empirical methods, and the clarity of the response regarding stochastic models. My issue was with the claim in Sec 2.2 that next-step prediction is insufficient to produce belief states, which is only an issue with approximation error when dealing with empirical results. This is not clearly explained in the paper, but it is clarified much more nicely in the rebuttal. This would cause me to raise my score from a 3 to a 4 for the misunderstanding, but I still do not find this paper worthy of acceptance. I don't think the insights are particularly surprising, and it seems the sole merit of this paper is an empirical one, impressive because of its performance on complex tasks.
Reviews: Shaping Belief States with Generative Environment Models for RL
This paper examines the use of generative models for developing representations to improve data efficiency in RL. Specifically, the authors use a generative model that is trained to predict multiple frames into the future (overshooting), and they show that when the model is stochastic (but not when it is deterministic), overshooting leads to useful representations of the environment that can improve RL efficiency. The reviews on this paper were fairly divergent in the first round. Two of the reviewers liked this paper, but one did not feel it provided truly novel contributions, and felt it only brought together previously proposed ideas for using predictive training to improve RL representations. In discussion, the reviewers came to the conclusion that the paper does demonstrate the utility of overshoot prediction for stochastic models and that an empirical demonstration like this can be useful.
Shaping Belief States with Generative Environment Models for RL
Gregor, Karol, Rezende, Danilo Jimenez, Besse, Frederic, Wu, Yan, Merzic, Hamza, Oord, Aaron van den